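Game content that automatically adapts to the player opens new doors for game development. In this paper, we propose an architecture that uses persona agents and experience metrics, which enables the procedural generation of levels tailored to specific player personas. Using our game "Grave Rave", we show that this approach successfully adapts to three distinct rule-based persona agents across three different experience metrics. Moreover, the adaptation is shown to be specific, meaning that the levels are persona-aware rather than merely a general optimization of the chosen metric.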
Reinforcement learning (RL) has shown great promise, with algorithms learning in environments with large state and action spaces purely from scalar reward signals. A crucial challenge for current deep RL algorithms is that they require a tremendous number of environment interactions for learning. This can be infeasible in situations where such interactions are expensive, such as in robotics. Offline RL algorithms try to address this issue by bootstrapping the learning process from existing logged data without needing to interact with the environment from the very beginning. While online RL algorithms are typically evaluated as a function of the number of environment interactions, there exists no single established protocol for evaluating offline RL methods. In this paper, we propose a sequential approach to evaluate offline RL algorithms as a function of the training set size and thus by their data efficiency. Sequential evaluation provides valuable insights into the data efficiency of the learning process and the robustness of algorithms to distribution changes in the dataset, while also harmonizing the visualization of the offline and online learning phases. Our approach is generally applicable and easy to implement. We compare several existing offline RL algorithms using this approach and present insights from a variety of tasks and offline datasets.
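As a rough illustration of evaluating an offline learner as a function of training set size, the sketch below retrains a toy learner on growing prefixes of a logged dataset and records a score for each size. Retraining from scratch on each prefix is a simplifying assumption, not the protocol from the abstract, and the names `sequential_evaluation`, `greedy_from_logs`, `TRUE_MEAN_REWARD`, and the logged data are all hypothetical.

```python
from collections import defaultdict


def sequential_evaluation(dataset, train_fn, eval_fn, sizes):
    """Return (training set size, score) pairs for growing prefixes of the dataset."""
    curve = []
    for n in sizes:
        policy = train_fn(dataset[:n])      # learn only from the first n logged samples
        curve.append((n, eval_fn(policy)))  # score the resulting policy
    return curve


def greedy_from_logs(logs):
    """Toy offline learner: pick the action with the highest mean logged reward."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for action, reward in logs:
        totals[action] += reward
        counts[action] += 1
    return max(counts, key=lambda a: totals[a] / counts[a])


# Hypothetical two-armed bandit: the true mean reward of each action.
TRUE_MEAN_REWARD = {0: 1.0, 1: 1.5}

# Hypothetical logged (action, reward) pairs.
logged_data = [(0, 1.0), (1, 0.0), (1, 2.0), (1, 2.0)]

curve = sequential_evaluation(
    logged_data, greedy_from_logs, TRUE_MEAN_REWARD.get, sizes=[2, 4]
)
print(curve)
```

Plotting the score against the training set size gives a curve comparable in shape to the interaction-count curves used for online RL.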
Despite the impact of psychiatric disorders on clinical health, early-stage diagnosis remains a challenge. Machine learning studies have shown that classifiers tend to be overly narrow in the diagnosis prediction task. The overlap between conditions leads to high heterogeneity among participants that is not adequately captured by classification models. To address this issue, normative approaches have surged as an alternative method. By using a generative model to learn the distribution of healthy brain data patterns, we can identify the presence of pathologies as deviations or outliers from the distribution learned by the model. In particular, deep generative models showed great results as normative models to identify neurological lesions in the brain. However, unlike most neurological lesions, psychiatric disorders present subtle changes widespread in several brain regions, making these alterations challenging to identify. In this work, we evaluate the performance of transformer-based normative models to detect subtle brain changes expressed in adolescents and young adults. We trained our model on 3D MRI scans of neurotypical individuals (N=1,765). Then, we obtained the likelihood of neurotypical controls and psychiatric patients with early-stage schizophrenia from an independent dataset (N=93) from the Human Connectome Project. Using the predicted likelihood of the scans as a proxy for a normative score, we obtained an AUROC of 0.82 when assessing the difference between controls and individuals with early-stage schizophrenia. Our approach surpassed recent normative methods based on brain age and Gaussian Process, showing the promising use of deep generative models to help in individualised analyses.
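To make the AUROC figure concrete, the sketch below computes AUROC directly from per-scan scores, treating a higher likelihood as "more neurotypical". It uses the pairwise (Mann-Whitney) definition, and the score values are invented for illustration; they are not data from the study.

```python
def auroc(control_scores, patient_scores):
    """Probability that a random control scores higher than a random patient.

    Ties count as half a win, which matches the Mann-Whitney U formulation.
    """
    wins = 0.0
    for c in control_scores:
        for p in patient_scores:
            if c > p:
                wins += 1.0
            elif c == p:
                wins += 0.5
    return wins / (len(control_scores) * len(patient_scores))


# Hypothetical log-likelihoods assigned by a normative model (illustrative only).
controls = [-10.2, -11.0, -9.8, -10.5]
patients = [-12.1, -10.4, -13.0]

print(auroc(controls, patients))
```

An AUROC of 0.5 means the scores do not separate the groups at all, and 1.0 means every control scores above every patient.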
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
Differentially Private Stochastic Gradient Descent (DP-SGD) is a key method for applying privacy in the training of deep learning models. This applies isotropic Gaussian noise to gradients during training, which can perturb these gradients in any direction, damaging utility. Metric DP, however, can provide alternative mechanisms based on arbitrary metrics that might be more suitable. In this paper we apply \textit{directional privacy}, via a mechanism based on the von Mises-Fisher (VMF) distribution, to perturb gradients in terms of \textit{angular distance} so that gradient direction is broadly preserved. We show that this provides $\epsilon d$-privacy for deep learning training, rather than the $(\epsilon, \delta)$-privacy of the Gaussian mechanism; and that experimentally, on key datasets, the VMF mechanism can outperform the Gaussian in the utility-privacy trade-off.
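For reference, the Gaussian baseline that the abstract contrasts against can be sketched as follows: clip a gradient to a maximum norm, then add isotropic Gaussian noise. This is a minimal single-gradient sketch (real DP-SGD clips per-example gradients before aggregating), it does not implement the VMF mechanism, and the function names and parameter values are assumptions for illustration. The cosine similarity helper shows how far the noisy gradient's direction drifts from the original.

```python
import math
import random


def clip_gradient(grad, max_norm):
    """Scale the gradient down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def gaussian_perturb(grad, max_norm, noise_multiplier, rng):
    """Clip, then add isotropic Gaussian noise with std = noise_multiplier * max_norm."""
    clipped = clip_gradient(grad, max_norm)
    std = noise_multiplier * max_norm
    return [g + rng.gauss(0.0, std) for g in clipped]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


rng = random.Random(0)       # fixed seed so the sketch is reproducible
gradient = [3.0, 4.0]        # hypothetical gradient
noisy = gaussian_perturb(gradient, max_norm=1.0, noise_multiplier=1.0, rng=rng)
print(cosine_similarity(gradient, noisy))
```

Because the noise is isotropic, nothing constrains the angle between the original and perturbed gradients, which is the utility concern that motivates perturbing by angular distance instead.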
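Although fairness-aware machine learning algorithms have been receiving increasing attention, the focus has been on centralized machine learning, leaving decentralized methods underexplored. Federated learning is a decentralized form of machine learning in which clients train local models and a server aggregates them to obtain a shared global model. Data heterogeneity among clients is a common characteristic of federated learning, and it may induce or exacerbate discrimination against unprivileged groups defined by sensitive attributes such as race or gender. In this work, we propose FAIR-FATE: a novel fair federated learning algorithm that aims to achieve group fairness while maintaining high utility through a fairness-aware aggregation method that computes the global model by taking the fairness of the clients into account. To this end, the global model update is computed by estimating a fair model update using a momentum term, which helps to overcome the oscillations of noisy non-fair gradients. To the best of our knowledge, this is the first approach in machine learning that aims to achieve fairness using a fair momentum estimate. Experimental results on four real-world datasets show that FAIR-FATE significantly outperforms state-of-the-art federated learning algorithms under different levels of data heterogeneity.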
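To achieve good performance and generalizability, medical image segmentation models should be trained on sizeable datasets with sufficient variability. Because of ethics and governance restrictions, and the costs associated with labelling data, scientific development is often stifled, with models trained and tested on limited data. Data augmentation is commonly used to artificially increase the variability of the data distribution and improve model generalizability. Recent works have explored deep generative models for image synthesis, as such an approach would enable the generation of an effectively unlimited amount of varied data, addressing both the generalizability and the data access problems. However, many proposed solutions limit the user's control over the generated content. In this work, we propose Brainspade, a model that combines a synthetic diffusion-based label generator with a semantic image generator. Our model can produce fully synthetic brain labels, with or without a pathology of interest, and then generate a corresponding MRI image in an arbitrary guided style. Experiments show that Brainspade synthetic data can be used to train segmentation models whose performance is comparable to that of models trained on real data.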
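Deep neural networks have brought remarkable breakthroughs in medical image analysis. However, because of their data-hungry nature, the modest dataset sizes in medical imaging projects might be hindering their full potential. Generating synthetic data offers a promising alternative that can complement training datasets and enable medical image research at a larger scale. Recently, diffusion models have caught the attention of the computer vision community by producing photorealistic synthetic images. In this study, we explore using latent diffusion models to generate synthetic images from high-resolution 3D brain images. We used T1w MRI images from the UK Biobank dataset (N = 31,740) to train our models to learn the probability distribution of brain images, conditioned on covariates such as age, sex, and brain structure volumes. We found that our models created realistic data and that the conditioning variables could be used to control the data generation effectively. In addition, we created a synthetic dataset of 100,000 brain images and made it openly available to the scientific community.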
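Most reinforcement learning algorithms make use of an experience replay buffer to repeatedly train on samples the agent has observed in the past. This prevents catastrophic forgetting, but simply assigning equal importance to every sample is a naive strategy. In this paper, we propose a method to prioritize samples according to how much can be learned from them. We define the learnability of a sample as the steady decrease, over time, of the training loss associated with that sample. We develop an algorithm that prioritizes samples with high learnability while assigning lower priority to samples that are hard to learn, typically because of noise or stochasticity. We show empirically that our method is more robust than random sampling and also better than prioritizing only with respect to the training loss, i.e. the temporal difference loss, which is what vanilla prioritized experience replay uses.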
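The idea of prioritizing by loss decrease rather than by loss magnitude can be sketched as below. The abstract does not give a formula, so the particular estimate used here (mean loss over the older half of a recent window minus the mean over the newer half, floored at zero) is an assumption for illustration, as are the names `learnability` and `sampling_probabilities` and the `window` and `eps` parameters.

```python
def learnability(loss_history, window=4):
    """Estimate how steadily a sample's loss has been decreasing.

    Compares the mean loss in the older half of the recent window with the
    mean in the newer half; a loss that is not decreasing scores zero.
    """
    recent = loss_history[-window:]
    if len(recent) < 2:
        return 0.0
    half = len(recent) // 2
    older = sum(recent[:half]) / half
    newer = sum(recent[half:]) / (len(recent) - half)
    return max(older - newer, 0.0)


def sampling_probabilities(histories, eps=1e-3):
    """Turn learnability scores into replay sampling probabilities.

    eps keeps every sample reachable, even those with zero learnability.
    """
    priorities = [learnability(h) + eps for h in histories]
    total = sum(priorities)
    return [p / total for p in priorities]


# Hypothetical per-sample loss histories.
steadily_improving = [1.0, 0.8, 0.6, 0.4]   # loss falls over time
noisy_and_stuck = [1.0, 0.9, 1.1, 1.0]      # high loss that never improves

probs = sampling_probabilities([steadily_improving, noisy_and_stuck])
print(probs)
```

Under loss-magnitude prioritization the noisy sample would be replayed most often because its loss stays high; under this scheme it receives only the floor priority.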
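Robots are expected to grasp a wide range of objects that vary in shape, weight, or material type. Providing robots with tactile capabilities similar to those of humans is therefore essential for applications involving human-robot or robot-robot interaction, particularly in situations where a robot is expected to grasp and manipulate complex objects it has not previously encountered. A key aspect of successful object grasping and manipulation is the use of high-quality fingertips equipped with multiple high-performance sensors, appropriately distributed over a specific contact surface. In this paper, we present a detailed analysis of the use of two different types of commercially available robotic fingertips (BioTac and WTS-FT), each equipped with multiple sensors distributed over the fingertip's contact surface. We further show that, because of the high performance of the fingertips, a complex adaptive grasping algorithm is not required to grasp everyday objects. We conclude that a simple algorithm based on a proportional controller is sufficient, provided the fingertips involved exhibit high sensitivity. In a quantified assessment, we also show that, due in part to the distribution of its sensors, the BioTac-based fingertip outperforms the WTS-FT device, enabling loads of up to 850 g to be lifted, and that the simple proportional controller can adapt the grasp even when the object is subjected to significant external vibration.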
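A proportional controller of the kind mentioned above adjusts the actuator command in proportion to the error between a target and a measured contact force. The sketch below is a minimal illustration against a simulated linear-spring contact; the contact model, the gain, the stiffness, and all names are assumptions and do not reflect the BioTac or WTS-FT interfaces.

```python
def proportional_step(target_force, measured_force, position, gain):
    """Move the finger by an amount proportional to the force error."""
    error = target_force - measured_force
    return position + gain * error


def simulated_contact_force(position, contact_position=0.0, stiffness=2.0):
    """Hypothetical contact model: a linear spring once the finger touches (N per mm)."""
    return stiffness * max(position - contact_position, 0.0)


def run_grasp(target_force, gain=0.2, steps=50):
    """Close the finger under proportional control and return the final force."""
    position = 0.0  # start at the point of first contact (mm)
    for _ in range(steps):
        force = simulated_contact_force(position)
        position = proportional_step(target_force, force, position, gain)
    return simulated_contact_force(position)


final_force = run_grasp(target_force=1.5)
print(final_force)
```

With these values the force error shrinks by a constant factor each step, since the product of gain and stiffness is below one; a larger product would make the loop oscillate or diverge, which is why sensor sensitivity and gain choice matter in practice.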